Fast Algorithms for Determining Protein Structure Similarity

Authors

  • Somenath Biswas
  • Samarjit Chakraborty
Abstract

The problem of identifying the common three-dimensional structure between two protein molecules has received considerable attention from both the biology community and algorithms researchers. A number of similarity measures have been proposed for this purpose, among them the RMS distance, measures based on geometric hashing, and measures based on contact map overlap. Very recently, a new measure called the bottleneck matching metric has been used to quantify the similarity between two drug or protein molecules. Although experimental studies have indicated the robustness of this metric, all algorithms developed so far that are based on it have running times that are high-degree polynomials in the number of atoms in the protein molecules, making them infeasible for practical applications. In this paper we show that by exploiting a very simple structural property of the α-carbon backbone structures of proteins, the running time of some of these algorithms can be considerably improved. This can be further combined with fairly standard algorithmic techniques such as randomization and/or an approximate matching scheme for bipartite graphs. The resulting algorithms have running times that are nearly linear in the number of atoms in the proteins being compared, making the bottleneck matching measure a viable candidate for practical applications.
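To make the bottleneck matching idea concrete, the following is a minimal sketch of the bottleneck distance between two equal-size point sets held in fixed positions. It is not the near-linear algorithm of the paper: it does not optimise over rigid motions, and it uses a plain binary search over pairwise distances combined with a simple augmenting-path bipartite matching, which is polynomial but slow. The function names `bottleneck_distance` and `has_perfect_matching` are hypothetical and chosen only for this illustration.

```python
from math import dist


def has_perfect_matching(adj, n):
    """Return True if the bipartite graph given by adj (left vertex ->
    list of allowed right vertices) has a matching covering all n left
    vertices. Uses simple augmenting paths (Kuhn's algorithm)."""
    match_right = [-1] * n  # match_right[v] = left vertex matched to v

    def try_augment(u, seen):
        for v in adj[u]:
            if not seen[v]:
                seen[v] = True
                if match_right[v] == -1 or try_augment(match_right[v], seen):
                    match_right[v] = u
                    return True
        return False

    return all(try_augment(u, [False] * n) for u in range(n))


def bottleneck_distance(a, b):
    """Smallest threshold t such that the points of a can be paired
    one-to-one with the points of b with every pair at distance <= t.
    Assumption: both point sets are fixed in space (no superposition)."""
    if len(a) != len(b):
        raise ValueError("point sets must have the same size")
    n = len(a)
    if n == 0:
        return 0.0
    d = [[dist(p, q) for q in b] for p in a]
    # The answer is always one of the pairwise distances.
    candidates = sorted({d[i][j] for i in range(n) for j in range(n)})
    lo, hi = 0, len(candidates) - 1
    while lo < hi:
        mid = (lo + hi) // 2
        t = candidates[mid]
        adj = [[j for j in range(n) if d[i][j] <= t] for i in range(n)]
        if has_perfect_matching(adj, n):
            hi = mid        # feasible: try a smaller threshold
        else:
            lo = mid + 1    # infeasible: threshold must grow
    return candidates[lo]
```

The binary search is valid because feasibility is monotone in the threshold: if a perfect matching exists at threshold t, it also exists at any larger threshold, and the largest pairwise distance always admits one.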


Similar resources

MINRMS: an efficient algorithm for determining protein structure similarity using root-mean-squared-distance

MOTIVATION Existing algorithms for automated protein structure alignment generate contradictory results and are difficult to interpret. An algorithm which can provide a context for interpreting the alignment and uses a simple method to characterize protein structure similarity is needed. RESULTS We describe a heuristic for limiting the search space for structure alignment comparisons between ...


Link Prediction using Network Embedding based on Global Similarity

Background: The link prediction issue is one of the most widely used problems in complex network analysis. Link prediction requires knowing the background of previous link connections and combining them with available information. The link prediction local approaches with node structure objectives are fast in case of speed but are not accurate enough. On the other hand, the global link predicti...


An improved opposition-based Crow Search Algorithm for Data Clustering

Data clustering is an ideal way of working with a huge amount of data and looking for a structure in the dataset. In other words, clustering is the classification of the same data; the similarity among the data in a cluster is maximum and the similarity among the data in the different clusters is minimal. The innovation of this paper is a clustering method based on the Crow Search Algorithm (CS...


A partition-based algorithm for clustering large-scale software systems

Clustering techniques are used to extract the structure of software for understanding, maintaining, and refactoring. In the literature, most of the proposed approaches for software clustering are divided into hierarchical algorithms and search-based techniques. In the former, clustering is a process of merging (splitting) similar (non-similar) clusters. These techniques suffered from the drawba...


Fast overlapping of protein contact maps by alignment of eigenvectors

MOTIVATION Searching for structural similarity is a key issue of protein functional annotation. The maximum contact map overlap (CMO) is one of the possible measures of protein structure similarity. Exact and approximate methods known to optimize the CMO are computationally expensive and this hampers their applicability to large-scale comparison of protein structures. RESULTS In this article,...



Publication date: 2001